Estimating individualized treatment rules is a central task for personalized medicine. [zhao2012estimating] and [zhang2012robust] proposed outcome weighted learning to estimate individualized treatment rules directly through maximizing the expected outcome without modeling the response directly. In this paper, we extend the outcome weighted learning to right-censored survival data without requiring either an inverse probability of censoring weighting or a semiparametric modeling of the censoring and failure times as done in [zhao2015doubly]. To accomplish this, we take advantage of the tree-based approach proposed in [zhu2012recursively] to nonparametrically impute the survival time in two different ways. The first approach replaces the reward of each individual by the expected survival time, while in the second approach only the censored observations are imputed by their conditional expected failure times. We establish consistency and convergence rates for both estimators. In simulation studies, our estimators demonstrate improved performance compared to existing methods. We also illustrate the proposed method on a phase III clinical trial of non-small cell lung cancer.